In my last article, I introduced an easy way to extract images from a Word document. Today, I will share a method to extract text from a Word document with C# and VB.NET. Word documents play an indispensable role in our daily work. Word offers powerful functions for formatting text, such as fonts, headers and footers, comments, hyperlinks and so on. For exactly this reason, we sometimes need to extract only the content, without its formatting, into a .txt file, and to do so without calling the document.SaveToFile method to save the Word document as text directly.
Easy way to extract text from a Word document with C#, VB.NET
Spire.Doc, an MS Word component, handles this task well. Using Spire.Doc, I need only three simple steps to extract the text. If you have not done so already, please install Spire.Doc on your system first.
Procedure
Step1. Create a new project
1. Create a new project in Visual Studio and set its target framework to .NET Framework 4.
2. Add the Spire.Doc DLL as a reference.
3. Add the following using directives at the top of the file.
C#
using System.IO;
using System.Text;
using Spire.Doc;
using Spire.Doc.Documents;
VB.NET
Imports System.IO
Imports System.Text
Imports Spire.Doc
Imports Spire.Doc.Documents
Step2. Extract text from a Word document with C#, VB.NET
1. Load a Word document from disk.
C# Code:
Document doc = new Document();
doc.LoadFromFile(@"D:\michelle\JaneEyre.doc", FileFormat.Doc);
VB.NET Code:
Dim doc As New Document()
doc.LoadFromFile("D:\michelle\JaneEyre.doc", FileFormat.Doc)
2. Extract the text from the Word document.
C# Code:
//create a StringBuilder to hold the extracted text
StringBuilder sb = new StringBuilder();
//append the text of every paragraph in every section
foreach (Section section in doc.Sections)
{
    foreach (Paragraph paragraph in section.Paragraphs)
    {
        sb.AppendLine(paragraph.Text);
    }
}
VB.NET Code:
'create a StringBuilder to hold the extracted text
Dim sb As New StringBuilder()
'append the text of every paragraph in every section
For Each section As Section In doc.Sections
    For Each paragraph As Paragraph In section.Paragraphs
        sb.AppendLine(paragraph.Text)
    Next
Next
Step3. Save the text to a .txt file and launch the .txt file.
C# Code:
//write the text of word document into a txt file
File.WriteAllText(@"result.txt", sb.ToString());
//launch the text file
System.Diagnostics.Process.Start(@"result.txt");
VB.NET Code:
'write the text of word document into a txt file
File.WriteAllText("result.txt", sb.ToString())
'launch the text file
System.Diagnostics.Process.Start("result.txt")
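Full code
For convenience, the three steps can be put together in one console program. This is only a minimal sketch: the namespace ExtractText and the class name Program are placeholders I chose for illustration, the input path is the sample path used above, and it assumes the project already references the Spire.Doc DLL.

```csharp
using System.IO;
using System.Text;
using Spire.Doc;
using Spire.Doc.Documents;

namespace ExtractText   //placeholder namespace, not from the original article
{
    class Program       //placeholder class name
    {
        static void Main(string[] args)
        {
            //load the Word document (sample path from Step2)
            Document doc = new Document();
            doc.LoadFromFile(@"D:\michelle\JaneEyre.doc", FileFormat.Doc);

            //collect the text of every paragraph in every section
            StringBuilder sb = new StringBuilder();
            foreach (Section section in doc.Sections)
            {
                foreach (Paragraph paragraph in section.Paragraphs)
                {
                    sb.AppendLine(paragraph.Text);
                }
            }

            //write the text into a txt file in the working directory
            File.WriteAllText("result.txt", sb.ToString());

            //open the txt file with its associated program
            System.Diagnostics.Process.Start("result.txt");
        }
    }
}
```

Note that the loop reads only the paragraphs found directly in each section, exactly as in Step2. Also, passing a bare file name to Process.Start relies on shell execution, which is the default on .NET Framework 4, the target chosen in Step1.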
Preview